Abstract: Message passing algorithms have proved surprisingly
successful in solving hard constraint satisfaction problems
on sparse random graphs. In such applications, variables are
fixed sequentially to satisfy the constraints. Message passing
is run after each step. Its outcome provides a heuristic to
make choices at the next step. This approach has been referred
to as 'decimation,' with reference to analogous procedures in
statistical physics.

The behavior of decimation procedures is poorly understood. 
Here we consider a simple randomized decimation algorithm 
based on belief propagation (BP), and analyze its behavior on 
random $k$-satisfiability formulae. In particular, we propose a
tree model for its analysis and we conjecture that it provides 
asymptotically exact predictions in the limit of large instances. 
This conjecture is confirmed by numerical simulations. 

I. Introduction 

An instance of a constraint satisfaction problem [1] consists
of $n$ variables $\underline{x} = (x_1, \dots, x_n)$ and $m$ constraints
among them. Solving such an instance amounts to finding an
assignment of the variables that satisfies all the constraints,
or proving that no such assignment exists. A remarkable
example in this class is provided by $k$-satisfiability, where
variables are binary, $x_i \in \{0, 1\}$, and each constraint requires
$k$ of the variables to be different from a specific $k$-tuple.
Explicitly, the $a$-th constraint (clause), $a \in [m] = \{1, \dots, m\}$,
is specified by $k$ variable indexes $i_1(a), \dots, i_k(a) \in [n]$,
and $k$ bits $z_1(a), \dots, z_k(a) \in \{0, 1\}$. Clause $a$ is satisfied
by assignment $\underline{x}$ if and only if
$(x_{i_1(a)}, \dots, x_{i_k(a)}) \neq (z_1(a), \dots, z_k(a))$.

A constraint satisfaction problem admits a natural factor
graph [2] representation, cf. Fig. 1. Given an instance, each
variable can be associated to a variable node, and each
constraint to a factor node. Edges connect factor node
$a \in F = [m]$ to those variable nodes $i \in V = [n]$ such that the
$a$-th constraint depends in a non-trivial way on variable $x_i$. For
instance, in the case of $k$-satisfiability, clause $a$ is connected
to variables $i_1(a), \dots, i_k(a)$. If the resulting graph is sparse,
fast message passing algorithms can be defined on it.
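
As an illustration of this representation, a $k$-SAT instance can be stored as a list of clauses, each listing its variables together with the forbidden bits $z_j(a)$. The following is a minimal sketch; the function names, the 0-based indexing and the adjacency-list layout are assumptions made for this example, not notation from the paper.

```python
def build_factor_graph(n, clauses):
    """Build the two adjacency lists of the factor graph.

    `clauses[a]` is a list of (variable index, forbidden bit) pairs: clause a
    is violated only when every listed variable equals its forbidden bit.
    """
    var_to_clauses = {i: [] for i in range(n)}
    clause_to_vars = {}
    for a, clause in enumerate(clauses):
        clause_to_vars[a] = [i for i, _ in clause]
        for i, _ in clause:
            var_to_clauses[i].append(a)
    return var_to_clauses, clause_to_vars


def is_satisfied(clause, x):
    """Clause is satisfied iff at least one variable avoids its forbidden bit."""
    return any(x[i] != z for i, z in clause)
```

Sparsity of the graph shows up here as short adjacency lists: each clause touches $k$ variables, and in the random ensembles considered below each variable typically belongs to a bounded number of clauses.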

Although constraint satisfaction problems are generally 
NP-hard, a large effort has been devoted to the development 
of efficient heuristics. Recently, considerable progress has 
been achieved in building efficient 'incomplete solvers' [3]. 
These are algorithms that look for a solution but, if they do 
not find one, cannot prove that the problem is unsolvable. A 
particularly interesting class is provided by message
passing-guided decimation procedures. These consist in iterating the
following steps:

1) Run a message passing algorithm.

2) Use the result to choose a variable index $i \in V$, and a
value $x_i^*$ for the corresponding variable.

3) Replace the constraint satisfaction problem with the
one obtained by fixing $x_i$ to $x_i^*$.

The iteration may stop for two reasons. In the first case a
contradiction is produced: the same variable $x_i$ appears in
two constraints whose other arguments have already been
fixed, and that are satisfied by distinct values of $x_i$. If
this does not happen, the iteration stops only when all the
variables are fixed and a solution is found. Notice that earlier
algorithms, such as unit clause propagation (UCP) [4], [5],
did not use message passing in step 2, and were not nearly
as effective.

Random constraint satisfaction problems are a useful
testing ground for new heuristics. For instance, random
$k$-satisfiability is the distribution over $k$-SAT formulae defined
by picking a formula uniformly at random among all the
ones including $m$ clauses over $n$ variables. Decimation
procedures of the type sketched above proved particularly
successful in this context. In particular, survey
propagation-guided decimation [6], [7] outperformed the best previous
heuristics based on stochastic local search [3]. More recently
belief propagation-guided decimation was shown empirically
to perform well too [8].

Unfortunately, so far there exists no analysis of message- 
passing guided decimation. Our understanding almost en- 
tirely relies on simulations, even for random instances. 
Consequently the comparison among different heuristics, as 
well as the underpinnings of their effectiveness are somewhat 
unclear. In this paper we define a simple class of randomized 
message passing-guided decimation algorithms, and present 
a technique for analyzing them on random instances. The 
technique is based on the identification of a process on 
infinite trees that describes the evolution of the decimation 
algorithm. The tree process is then analyzed through an 
appropriate generalization of density evolution [14]. Our 
approach is close in spirit to the one of [9]. While it applies
to a large class of random constraint satisfaction problems
(including, e.g., coloring of random graphs), for the sake of
concreteness we will focus on random $k$-SAT.

We expect the tree process to describe exactly the
algorithm behavior in the limit of large instances, $n \to \infty$.
While we could not prove this point, numerical simulations
convincingly support this conjecture. Further, non-rigorous
Fig. 1. Factor graph of a small 3-SAT instance. Continuous edges
correspond to $z_j(a) = 0$, and dashed ones to $z_j(a) = 1$. The corresponding
Boolean formula reads (xi V X2 V £3) A (~x~2 V 14 V X5) A (x% V x$ V 
^s) A (x 3 V17V x 8 ). 

predictions based on tree calculations have been repeatedly
successful in the analysis of random $k$-satisfiability. This
approach goes under the name of 'cavity method' in statistical
mechanics [6].

The paper is organized as follows. Section II contains
some necessary background and notation on random $k$-SAT
as well as a synthetic discussion of related work. In Section
III we define the decimation procedure that we are going
to analyze. We further provide the basic intuition behind the
definition of the tree model. The latter is analyzed in Section
IV and the predictions thus derived are compared with
numerical simulations in Section V. Finally, some conclusions
and suggestions for future work are presented in Section VI.
Proofs of several auxiliary lemmas are omitted from this
extended abstract and deferred to technical appendices.

II. Random k-SAT and message passing:
Background and related work 

As mentioned above, random $k$-SAT refers to the uniform
distribution over $k$-SAT instances with $m$ constraints over $n$
variables. More explicitly, each constraint is drawn uniformly
at random among the $2^k \binom{n}{k}$ possible ones. We are interested
here in the limit $n, m \to \infty$ with $m/n = \alpha$ fixed.
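
The ensemble just described is straightforward to sample. The sketch below draws each clause by picking $k$ distinct variables and $k$ independent forbidden bits, which is uniform over the $2^k \binom{n}{k}$ possible clauses; the function name `random_ksat` and the (variable, forbidden bit) clause encoding are assumptions of this example.

```python
import random


def random_ksat(n, m, k, rng=None):
    """Draw m clauses independently and uniformly over n variables.

    Each clause is a list of k pairs (variable index, forbidden bit); it is
    violated only when all its variables equal their forbidden bits.
    """
    rng = rng or random.Random()
    clauses = []
    for _ in range(m):
        variables = rng.sample(range(n), k)  # k distinct variable indices
        clauses.append([(i, rng.randrange(2)) for i in variables])
    return clauses
```

For instance, `random_ksat(4000, 28000, 4)` would produce a 4-SAT formula at clause density $\alpha = m/n = 7$.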

Consider the factor graph $G$ of a random $k$-SAT formula,
endowed with the graph-theoretic distance. Namely, the
distance of two variable nodes $d(i,j)$ is the length of the
shortest path leading from $i$ to $j$ on $G$. It is well known
[10] that, in the large size limit, any finite neighborhood of
a random node $i$ converges in distribution to a well defined
random tree. This observation will be the basis of our tree
analysis of the decimation process, and is therefore worth
spelling out in detail. Let $B(i,\ell)$ be the subgraph induced
by all the vertices $j \in G$ such that $d(i,j) \le \ell$. Then
$B(i,\ell) \to T(\ell)$ as $n \to \infty$, where $T(\ell)$ is the random
rooted (factor) tree defined recursively as follows. For $\ell = 0$,
$T(\ell)$ is the graph containing a unique variable node. For any
$\ell \ge 1$, start with a single variable node (the root) and add
$l \sim \mathrm{Poisson}(\alpha k)$ clauses, each one including the root and
$k - 1$ new variables (first generation variables). If $\ell \ge 2$,
generate an independent copy of $T(\ell - 1)$ for each variable
node in the first generation and attach it to them. The values
$z_j(a)$ that violate clause $a$ are independently chosen in $\{0, 1\}$
with equal probability. It is easy to see that the limit object
$T(\infty)$ is well defined and is an infinite tree with positive
probability if $\alpha > 1/(k(k-1))$.
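
The recursive construction of the random tree can be sketched generation by generation. The code below is an illustrative sampler, not taken from the paper: the names `poisson` and `sample_tree` are hypothetical, the Poisson variate uses Knuth's multiplication method (adequate for moderate means), and clauses use the (variable, forbidden bit) encoding with the root labelled 0.

```python
import math
import random


def poisson(lam, rng):
    """Sample a Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, prod = 0, rng.random()
    while prod > threshold:
        count += 1
        prod *= rng.random()
    return count


def sample_tree(depth, alpha, k, rng):
    """Return (number of variables, clauses) of a tree with `depth` generations."""
    clauses = []
    next_var = 1          # variable 0 is the root
    frontier = [0]        # variables whose descendant clauses are still to be drawn
    for _ in range(depth):
        new_frontier = []
        for v in frontier:
            for _clause in range(poisson(alpha * k, rng)):
                children = list(range(next_var, next_var + k - 1))
                next_var += k - 1
                # Forbidden bits are independent and uniform in {0, 1}.
                clauses.append([(w, rng.randrange(2)) for w in [v] + children])
                new_frontier.extend(children)
        frontier = new_frontier
    return next_var, clauses
```

Each variable has on average $\alpha k (k-1)$ children, so the tree size grows quickly with the depth once $\alpha > 1/(k(k-1))$; small depths are advisable when experimenting.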

We let $\alpha_s(k)$ be the largest value of $\alpha$ such that random
$k$-SAT instances admit with high probability a solution. It is
known [11] that $\alpha_s(k) = 2^k \log 2 - O(k)$. A sharp conjecture
on the value of $\alpha_s(k)$ has been put forward in [6] on the
basis of statistical physics calculations, implying
$\alpha_s(k) \approx 4.267, 9.93, 21.12$ for (respectively) $k = 3, 4, 5$ and
$\alpha_s(k) = 2^k \log 2 - \frac{1}{2}(1 + \log 2) + O(2^{-k})$ for large $k$ [12].

Simple heuristics have been analyzed thoroughly [5] and
proved to find a solution with probability bounded away
from 0 if $\alpha < \mathrm{const} \cdot 2^k/k$. Here the proportionality constant
depends on the specific heuristic.

To the best of our knowledge, the first application of
message passing algorithms to $k$-satisfiability is reported in
[13]. In this early study BP was mostly applied in a one-shot
fashion (as in iterative decoding of sparse graph codes [14]),
without decimation. By this we mean that belief propagation
is run, and the resulting marginal probabilities are used to guess
the values of all variables at once. However the probability
of success of the one-shot algorithm is exponentially small:
there are $\Theta(n)$ isolated constraints whose variables have
non-trivial marginal probabilities, and each of them is hence
violated with finite probability in the one-shot assignment.

Statistical mechanics methods made it possible to derive a very
precise picture of the solution set [15], [6], [8]. This inspired
a new message passing algorithm dubbed survey propagation [7].
In conjunction with decimation, this algorithm
made it possible to solve random instances of unprecedentedly large
sizes, in difficult regimes of $\alpha$ and $k$.

A natural way of introducing belief propagation for
$k$-satisfiability is to consider the uniform distribution over
solutions (assuming their existence). Let us denote by
$\partial a = \{i_1(a), \dots, i_k(a)\}$ the set of variable nodes on which the $a$-th
constraint effectively depends, by $\underline{x}_U = \{x_i : i \in U\}$ the
partial assignment of any subset $U$ of the variable nodes, and by
$w_a(\underline{x}_{\partial a}) = \mathbb{I}\{(x_{i_1(a)}, \dots, x_{i_k(a)}) \neq (z_1(a), \dots, z_k(a))\}$ the
indicator function of the event 'clause $a$ is satisfied.' The
uniform distribution over the solutions can thus be written

$$\mu(\underline{x}) = \frac{1}{Z} \prod_{a \in F} w_a(\underline{x}_{\partial a}) . \qquad (1)$$
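
For very small formulas the uniform measure over solutions can be evaluated by exhaustive enumeration, which is useful as a reference when testing message passing code. The sketch below assumes the (variable, forbidden bit) clause encoding; `solutions` and `exact_marginals` are hypothetical helper names, and the cost is exponential in $n$.

```python
from itertools import product


def solutions(n, clauses):
    """Enumerate all assignments in {0,1}^n that satisfy every clause."""
    return [x for x in product((0, 1), repeat=n)
            if all(any(x[i] != z for i, z in c) for c in clauses)]


def exact_marginals(n, clauses):
    """Return [P(x_i = 1)] under the uniform measure over solutions."""
    sols = solutions(n, clauses)
    if not sols:
        raise ValueError("formula is unsatisfiable: the measure is undefined")
    Z = len(sols)  # normalization: the number of solutions
    return [sum(x[i] for x in sols) / Z for i in range(n)]
```

For the single clause `[(0, 0), (1, 0)]` on two variables, three of the four assignments are solutions, so each variable equals 1 with probability 2/3.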

In [16] it was proved that for $\alpha < (2 \log k)/k \, [1 + o(1)]$,
BP computes good approximations of the marginals of $\mu$,
irrespective of its initialization. It is clear from empirical
studies [17], [18] that the 'worst case' argument used in
this estimate (and in other papers on belief propagation [19],
[20]) is far too pessimistic.

In Ref. [21] a simple message passing algorithm, warn- 
ing propagation (see below), was analyzed for a modified 
('planted') ensemble of random formulae. The algorithm 
was proved to converge and find solutions for large enough 
density $\alpha$ (see also [22], [23]). Both the ensemble and the
algorithm are quite different from the ones treated in this 
paper. 



Further, the definition and analysis of a 'Maxwell decoder' 
in [24], [25], is closely related to the approach in this 
paper. Let us recall that the Maxwell decoder was a (mostly 
conceptual) algorithm for implementing maximum likelihood 
decoding of LDPC codes over the erasure channel. The 
treatment in [24], [25] applies almost verbatim to a simple 
constraint satisfaction problem known as XORSAT. The 
generalization in the present paper is analogous to the one 
from the erasure to a general binary memoryless symmetric 
channel. 

Finally, let us mention that BP decimation can be an in- 
teresting option in engineering applications, as demonstrated 
empirically in the case of lossy source coding [26], [27]. 

III. A simple decimation procedure
A. Belief propagation 

Let us recall the definition of BP for our specific setup ([2],
[28] are general references). BP is a message passing
algorithm: at each iteration messages are sent from variable nodes
to neighboring clause nodes and vice versa. To describe the
message update equations, we need some more notation. As
in the case of factor nodes, we shall call $\partial i$ the set of factors
that depend on the variable $x_i$. If $i \in \partial a$, say $i = i_l(a)$,
we denote by $z(i,a) = z_l(a)$ the value of $x_i$ which does not
satisfy the $a$-th clause. For a pair of adjacent variable ($i$)
and factor ($a$) nodes (i.e. $i \in \partial a$), let us call $\partial_+ i(a)$ (resp.
$\partial_- i(a)$) the set of factor nodes adjacent to $i$, distinct from $a$,
that agree (resp. disagree) with $a$ on the satisfying value
of $x_i$. In formulae, $\partial_+ i(a) = \{b \in \partial i \setminus a : z(i,b) = z(i,a)\}$
and $\partial_- i(a) = \{b \in \partial i : z(i,b) = 1 - z(i,a)\}$.

It is convenient to use log-likelihood notations for
messages as is done in iterative decoding [14], with two caveats:
(1) We introduce a factor 1/2 to be consistent with physics
notation; (2) The message from variable node $i$ to factor
node $a$ corresponds to the log-likelihood for $x_i$ to satisfy/not
satisfy clause $a$ (rather than to be 0/1).



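Let $\{h^{(r)}_{i \to a}\}$, $\{u^{(r)}_{a \to i}\}$ denote the messages that are passed
at time $r$ along the directed edges $i \to a$ and $a \to i$, for
$i \in V$ and $a \in F$. The update equations read

$$h^{(r+1)}_{i \to a} = \sum_{b \in \partial_+ i(a)} u^{(r)}_{b \to i} - \sum_{b \in \partial_- i(a)} u^{(r)}_{b \to i} , \qquad (2)$$

$$u^{(r)}_{a \to i} = -\frac{1}{2} \log\Big\{ 1 - \frac{1 - \epsilon}{2^{k-1}} \prod_{j \in \partial a \setminus i} \big[ 1 - \tanh h^{(r)}_{j \to a} \big] \Big\} , \qquad (3)$$

with $\epsilon = 0$ (this parameter is introduced for the discussion
in a later section).

For $i \in V$, let $\partial_+ i$ be the subset of clauses that are satisfied
by $x_i = 0$, and $\partial_- i$ the subset satisfied by $x_i = 1$. Then the
BP estimate for the marginal of $x_i$ under the measure $\mu(\cdot)$ is

$$\nu^{(r)}_i(x_i) = \frac{1 + (-1)^{x_i} \tanh h^{(r)}_i}{2} , \qquad h^{(r)}_i = \sum_{a \in \partial_+ i} u^{(r-1)}_{a \to i} - \sum_{a \in \partial_- i} u^{(r-1)}_{a \to i} . \qquad (4)$$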
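
A compact way to experiment with these updates is sketched below. It is an illustration under stated assumptions rather than the authors' implementation: messages follow the half log-likelihood convention described above (a clause-to-variable message $u \ge 0$ biases the variable towards satisfying the clause), updates are done in parallel for a fixed number of sweeps, and a small numerical `guard` (an assumption of this sketch) keeps the logarithm finite. The name `bp_marginals` and the (variable, forbidden bit) clause encoding are hypothetical.

```python
import math


def bp_marginals(n, clauses, iters=50, guard=1e-12):
    """Return BP estimates of P(x_i = 0) after `iters` parallel sweeps."""
    # Forbidden bit z(i, a) for every edge, keyed by (clause a, variable i).
    z = {(a, i): bit for a, clause in enumerate(clauses) for i, bit in clause}
    nbrs = {i: [] for i in range(n)}
    for a, i in z:
        nbrs[i].append(a)
    h = dict.fromkeys(z, 0.0)  # variable -> clause messages
    u = dict.fromkeys(z, 0.0)  # clause -> variable messages
    for _ in range(iters):
        for a, i in z:
            # Probability that all other variables of a take their forbidden bit.
            prod = 1.0
            for j, _bit in clauses[a]:
                if j != i:
                    prod *= (1.0 - math.tanh(h[(a, j)])) / 2.0
            u[(a, i)] = -0.5 * math.log(1.0 - min(prod, 1.0 - guard))
        for a, i in z:
            # Clauses agreeing with a on the forbidden bit add, the others subtract.
            h[(a, i)] = sum(
                u[(b, i)] if z[(b, i)] == z[(a, i)] else -u[(b, i)]
                for b in nbrs[i] if b != a
            )
    marginals = []
    for i in range(n):
        # A clause with forbidden bit 1 is satisfied by x_i = 0 and pushes towards 0.
        field = sum(u[(a, i)] if z[(a, i)] == 1 else -u[(a, i)] for a in nbrs[i])
        marginals.append((1.0 + math.tanh(field)) / 2.0)
    return marginals
```

On a formula whose factor graph is a tree the estimates coincide with the exact marginals; for the chain `[[(0, 0), (1, 0)], [(1, 1), (2, 0)]]`, which has four solutions, the function returns values close to 1/4, 1/2 and 1/4.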
B. Unit clause and warning propagation



During the decimation procedure a subset $U$ of the
variables is fixed to specific values, collectively denoted as
$\underline{x}^*_U$. This has some direct implications. By this we mean
that for some other variables $x_j$, $j \in V \setminus U$, it follows
from 'unit clause propagation' (UCP) that they take the
same value in all of the solutions compatible with the partial
assignment $\underline{x}^*_U$. We will say that these variables are directly
implied by the condition $\underline{x}_U = \underline{x}^*_U$. Let us recall that unit
clause propagation corresponds to the following deduction
procedure. For each of the fixed variables $x_i$, and each of
the clauses $a$ it belongs to, the value $x_i^*$ can either satisfy
clause $a$, or not. In the first case clause $a$ can be eliminated
from the factor graph. In the second a smaller clause with
one less variable is implied. In both cases variable $x_i$ is
removed. It can happen that the size of a clause gets reduced
to 1 through this procedure. In this case the only variable
belonging to the clause must take a definite value in order
to satisfy it. We say that such a variable is directly implied
by the fixed ones. Whenever a variable is directly implied,
its value can be substituted in all the clauses it belongs to,
thus allowing further reductions.

The process stops for one of two reasons: (1) All the fixed
or directly implied variables have been pruned and no unit
clause is present in the reduced formula. In this case we
refer to all variables that appeared at some point in a unit
clause as directly implied variables. (2) Two unit clauses
imply different values for the same variable. We will say
that a contradiction was revealed in this case: no solution $\underline{x}$
of the formula can verify the condition $\underline{x}_U = \underline{x}^*_U$.
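
The deduction procedure can be sketched as a simple fixed-point loop. This is an illustrative, unoptimized version (it rescans all clauses until nothing changes, whereas an efficient implementation would keep a queue of unit clauses); the function name and the (variable, forbidden bit) clause encoding are assumptions of this example. A contradiction is reported when some clause has all its variables set to their forbidden bits, which is what happens when two unit clauses demand different values for the same variable.

```python
def unit_clause_propagation(clauses, fixed):
    """Propagate the partial assignment `fixed` (dict: variable -> value).

    Returns (implied, contradiction): `implied` maps each directly implied
    variable to its forced value, `contradiction` is True if some clause
    ends up with every variable at its forbidden bit.
    """
    assignment = dict(fixed)
    implied = {}
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(i in assignment and assignment[i] != z for i, z in clause):
                continue  # clause already satisfied: it can be eliminated
            free = [(i, z) for i, z in clause if i not in assignment]
            if not free:
                return implied, True  # empty clause: contradiction
            if len(free) == 1:
                i, z = free[0]  # unit clause: its variable must avoid z
                assignment[i] = 1 - z
                implied[i] = 1 - z
                changed = True
    return implied, False
```

For example, with clauses `[[(0, 0), (1, 0)], [(1, 1), (2, 0)]]` and `fixed = {0: 0}`, the first clause forces $x_1 = 1$, which in turn makes the second clause a unit clause forcing $x_2 = 1$.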

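A key element in our analysis is the remark that UCP
admits a message passing description. The corresponding
algorithm is usually referred to as warning propagation
(WP) [29]. The WP messages (to be denoted as $\mathfrak{h}^{(r)}_{i \to a}$, $\mathfrak{u}^{(r)}_{a \to i}$)
take values in $\{I, 0\}$. The meaning of $\mathfrak{u}^{(r)}_{a \to i} = I$ (respectively
$\mathfrak{u}^{(r)}_{a \to i} = 0$) is: 'variable $x_i$ is (resp. is not) directly implied
by clause $a$ to satisfy it.' For variable-to-factor messages, the
meaning of $\mathfrak{h}^{(r)}_{i \to a} = I$ (respectively $\mathfrak{h}^{(r)}_{i \to a} = 0$) is: 'variable
$x_i$ is (resp. is not) directly implied, through one of the clauses
$b \in \partial i \setminus a$, not to satisfy clause $a$.'

We want to apply WP to the case in which a part of the
variables have been fixed, namely $x_i = x_i^*$ for any $i \in U \subseteq V$.
In this case the WP rules read

$$\mathfrak{h}^{(r+1)}_{i \to a} = \begin{cases} I & \text{if } \mathfrak{u}^{(r)}_{b \to i} = I \text{ for some } b \in \partial_- i(a), \text{ or } i \in U \text{ and } x_i^* = z(i,a), \\ 0 & \text{otherwise,} \end{cases} \qquad (5)$$

$$\mathfrak{u}^{(r)}_{a \to i} = \begin{cases} I & \text{if } \mathfrak{h}^{(r)}_{j \to a} = I \text{ for all } j \in \partial a \setminus i, \\ 0 & \text{otherwise.} \end{cases} \qquad (6)$$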
BP-Decimation ($k$-SAT instance $G$)

1: initialize BP messages $\{h_{i \to a} = 0, u_{a \to i} = 0\}$;
2: initialize WP messages $\{\mathfrak{h}_{i \to a} = 0, \mathfrak{u}_{a \to i} = 0\}$;
3: initialize $U = \emptyset$;
4: for $t = 1, \dots, n$
5:     run BP until the stopping criterion is met;
6:     choose $i \in V \setminus U$ uniformly at random;
7:     compute the BP marginal $\nu_i(x_i)$;
8:     choose $x_i^*$ distributed according to $\nu_i$;
9:     fix $x_i = x_i^*$ and set $U \leftarrow U \cup \{i\}$;
10:    run WP until convergence;
11:    if a contradiction is found, return FAIL;
12: end
13: return $\underline{x}^*$.

TABLE I
The Belief Propagation-guided decimation algorithm.

In the following we shall always assume that WP is initialized
with $\mathfrak{h}^{(0)}_{i \to a} = 0$, $\mathfrak{u}^{(0)}_{a \to i} = 0$ for each edge $(i,a) \in E$.
It is then easy to prove that messages are monotone in
the iteration number (according to the ordering $0 < I$).
In particular the WP iteration converges in at most $O(n)$
iterations. We denote by $\{\mathfrak{u}^{(\infty)}_{a \to i}\}$ the corresponding fixed point
messages, and say that $i \in V \setminus U$ is WP-implied by the fixed
variables if there exists $a \in \partial i$ such that $\mathfrak{u}^{(\infty)}_{a \to i} = I$. Then the
equivalence between UCP and WP can be stated in the form
below.

Lemma 1. Assume a partial assignment to be given for
$U \subseteq V$. Then

1) The fixed point WP messages $\{\mathfrak{u}^{(\infty)}_{a \to i}\}$ do not depend on
the order of the WP updates (as long as any variable
is updated an a priori unlimited number of times).

2) $i \in V \setminus U$ is directly implied iff it is WP-implied.

3) UCP encounters a contradiction iff there exist $i \in V$,
$a \in \partial_+ i$, $b \in \partial_- i$ such that $\mathfrak{u}^{(\infty)}_{a \to i} = \mathfrak{u}^{(\infty)}_{b \to i} = I$.

For the clarity of what follows let us emphasize the
terminology of fixed variables (those in $U$) and of directly
implied variables (not in $U$, but implied by $\underline{x}^*_U$ through UCP
or WP). Finally, we will call frozen variables the union of
fixed and directly implied ones, and denote the set of frozen
variables by $W \subseteq V$.

C. Decimation 

The BP-guided decimation algorithm is defined by the
pseudocode of Table I. There are still a couple of elements
we need to specify. First of all, we need to say how the BP
equations (2), (3) are modified when a non-empty subset $U$ of the
variables is fixed. One option would be to eliminate these variables
from the factor graph, and reduce the clauses they belong
to accordingly. A simpler approach consists in modifying
Eq. (2) when $i \in U$. Explicitly, if the chosen value $x_i^*$
satisfies clause $a$, then we set $h^{(r)}_{i \to a} = +\infty$. If it does not,
we set $h^{(r)}_{i \to a} = -\infty$.

Next, let us stress that, while WP is run until convergence,
a not-yet defined 'stopping criterion' is used for BP. This will
be made precise in Section V. Here we just say that it includes
a maximum iteration number $r_{\max}$, which is kept of smaller
order than $O(n)$.

The algorithm complexity is therefore naively $O(n^3 r_{\max})$.
It requires $n$ cycles, each involving: (1) at most $r_{\max}$ BP
iterations of $O(n)$ complexity and (2) at most $n$ WP iterations
of complexity $O(n)$. It is easy to reduce the complexity to
$O(n^2 r_{\max})$ by updating WP in sequential (instead of parallel)
order, as in UCP. Finally, a natural choice (corresponding
to the assumption that BP converges exponentially fast) is to
take $r_{\max} = O(\log n)$, leading to $O(n^2 \log n)$ complexity.

In practice WP converges after a small number of
iterations, and the BP updates are the most expensive part of
the algorithm. This could be reduced further by using the
fact that fixing a single variable should produce only a small
change in the messages. Ref. [7] uses this argument for a
similar algorithm to argue that $O(n \log n)$ time is enough.

D. Intuitive picture 

Analyzing the dynamics of BP-decimation seems extremely
challenging. The problem is that the procedure is
not 'myopic' [5], in the sense that the value chosen for
variable $x_i$ depends on a large neighborhood of node $i$ in the
factor graph. By analogy with myopic decimation algorithms
one expects the existence of a critical value of the clause
density $\alpha_{\rm BPd}(k)$ such that the algorithm finds a solution with
probability bounded away from 0 for $\alpha < \alpha_{\rm BPd}(k)$, while
it is unsuccessful with high probability for $\alpha > \alpha_{\rm BPd}(k)$.
Notice that, if the algorithm finds a solution with positive
probability, restarting it a finite number of times should$^1$
yield a solution with probability arbitrarily close to 1.

We shall argue in favor of this scenario and present an
approach to analyze the algorithm evolution for $\alpha$ smaller
than a spinodal point $\alpha_{\rm spin}(k)$. More precisely, our analysis
allows us to compute the asymptotic fraction of 'directly
implied' variables after any number of iterations. Further, the
outcome of this computation provides a strong indication that
$\alpha_{\rm spin}(k) \le \alpha_{\rm BPd}(k)$. Both the analysis and the conclusion
that $\alpha_{\rm spin}(k) \le \alpha_{\rm BPd}(k)$ are confirmed by large scale
numerical simulations.

Our argument goes in two steps. In this section we 
show how to reduce the description of the algorithm to a 
sequence of 'static' problems. The resolution of the latter 
will be treated in the next section. Both parts rely on some 
assumptions on the asymptotic behavior of large random 
$k$-SAT instances, that originate in the statistical mechanics
treatment of this problem [6], [8]. We will spell out such 
assumptions along the way. 

As a preliminary remark, notice that the two message
passing algorithms play different roles in the BP-decimation
procedure of Table I. BP is used to estimate marginals of
the uniform measure $\mu(\cdot)$ over solutions, cf. Eq. (1), in the
first repetition of the loop. In subsequent repetitions, it is
used to compute marginals of the conditional distribution,
given the current assignment $\underline{x}_U = \underline{x}^*_U$. These marginals are
in turn used to choose the values $\{x_i^*\}$ of variables to be
fixed. WP is on the other hand used to check a necessary
condition for the current partial assignment to be consistent.
Namely it checks if it induces a contradiction on directly
implied variables. In fact, it could be replaced by UCP, and,
in any case, it does not influence the evolution of the partial
assignment $\underline{x}^*_U$.

$^1$ A caveat: here we are blurring the distinction between probability with
respect to the formula realization and the algorithm realization.

Let us introduce some notation: $(i(1), i(2), \dots, i(n))$ is
the order in which variables are chosen at step 6 in the
algorithm, $U_t = \{i(1), \dots, i(t)\}$ the set of fixed variables
at the beginning of the $(t+1)$-th repetition of the loop, and
$W_t$ the frozen variables at that time (i.e. the union of $U_t$ and
the variables directly implied by $\underline{x}^*_{U_t}$).

We begin the argument by considering an 'idealized'
version of the algorithm where BP is replaced by a black
box that is able to return the exact marginal distribution
of the measure conditioned on the previous choices, namely
$\nu_i(\cdot) = \mu_{i|U}(\cdot \,|\, \underline{x}^*_U)$. Let us point out two simple properties
of this idealized algorithm. First, it always finds a solution
if the input formula is satisfiable (this will be the case with
high probability if we assume $\alpha < \alpha_s(k)$). In fact, assume
by contradiction that the algorithm fails. Then, there has
been a last time $t$ such that the $k$-SAT instance has at least
one solution consistent with the condition
$\underline{x}_{U_{t-1}} = \underline{x}^*_{U_{t-1}}$,
but no solution under the additional constraint $x_i = x_i^*$
for $i = i(t)$. This cannot happen because it would imply
$\mu_{i|U_{t-1}}(x_i^* \,|\, \underline{x}^*_{U_{t-1}}) = 0$, and if this is the case, we would not
have chosen $x_i^*$ in step 8 of the algorithm.

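The second property is that the algorithm output
configuration $\underline{x}^*$ is a uniformly random solution. This follows
from our assumption since

$$\mathbb{P}\{\underline{x}^* \,|\, i(\cdot)\} = \prod_{t=1}^{n} \nu_{i(t)}(x^*_{i(t)}) = \prod_{t=1}^{n} \mu_{i(t)|U_{t-1}}(x^*_{i(t)} \,|\, \underline{x}^*_{U_{t-1}}) = \mu(\underline{x}^*) . \qquad (7)$$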
Therefore, the distribution of the state of the idealized
algorithm after any number $t$ of decimation steps can be
described as follows. Pick a uniformly random solution
$\underline{x}^*$, and a uniformly random subset of the variable indexes
$U_t \subseteq V$, with $|U_t| = t$. Then fix the variables $i \in U_t$ to
take value $x_i = x_i^*$, and discard the rest of the reference
configuration $\underline{x}^*$ (i.e. the bits $x_j^*$ for $j \notin U_t$).
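
The idealized algorithm can be simulated on tiny formulas by replacing the black box with exhaustive counting. The sketch below is only an illustration of the argument: `count_solutions` and `idealized_decimation` are hypothetical names, clauses use the (variable, forbidden bit) encoding, and the running time is exponential in $n$. At each step the conditional marginal is the ratio of the solution counts obtained with the chosen variable set to 1 and to 0, so a value with zero conditional probability is never selected.

```python
import random
from itertools import product


def count_solutions(n, clauses, fixed):
    """Number of solutions compatible with the partial assignment `fixed`."""
    free = [i for i in range(n) if i not in fixed]
    total = 0
    for bits in product((0, 1), repeat=len(free)):
        x = dict(fixed)
        x.update(zip(free, bits))
        if all(any(x[i] != z for i, z in c) for c in clauses):
            total += 1
    return total


def idealized_decimation(n, clauses, rng):
    """Fix variables in random order using exact conditional marginals."""
    fixed = {}
    order = list(range(n))
    rng.shuffle(order)  # uniformly random order of the variables
    for i in order:
        z1 = count_solutions(n, clauses, {**fixed, i: 1})
        z0 = count_solutions(n, clauses, {**fixed, i: 0})
        if z0 + z1 == 0:
            return None  # can only happen if the formula is unsatisfiable
        fixed[i] = 1 if rng.random() < z1 / (z0 + z1) else 0
    return [fixed[i] for i in range(n)]
```

By the chain rule the returned assignment is distributed uniformly over the solutions, whatever the order in which variables are processed.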

We now put aside the idealized algorithm and consider
the effect of fixing the $t$-th variable $i(t)$ to $x^*_{i(t)}$. Three cases
can in principle arise: (i) $x_{i(t)}$ was directly implied to be
equal to $1 - x^*_{i(t)}$ by $\underline{x}^*_{U_{t-1}}$ and a contradiction is generated.
We assume that BP is able to detect this direct implication
and avoid such a trivial contradiction; (ii) $x_{i(t)}$ was directly
implied to $x^*_{i(t)}$ by $\underline{x}^*_{U_{t-1}}$. The set of frozen variables remains
the same, $W_t = W_{t-1}$, as this step is merely the actuation
of a previous logical implication; (iii) $i(t)$ was not directly
implied by $\underline{x}^*_{U_{t-1}}$. This is the only interesting case, which we
develop now.



Let us call $Z_t = W_t \setminus W_{t-1}$ the set of newly frozen
variables after this fixing step. A moment of reflection shows
that $Z_t$ contains $i(t)$ and that it forms a connected subset of
$V$ in $G$. Consider now the subgraph $G_t \subseteq G$ induced by $Z_t$
(i.e. $G_t = (Z_t, F_t, E_t)$ where $F_t$ is the set of factor nodes
having at least one adjacent variable in $Z_t$, and $E_t$ is the set
of edges between $Z_t$ and $F_t$). A crucial observation is the
following:

Lemma 2. If $G_t$ is a tree, no contradiction can arise during
step $t$.

From this lemma, and since the factor graph of a typical
random formula is locally tree-like, one is naturally led to
study the size of $Z_t$, i.e. of the cascade of newly implied
variables induced by fixing the $t$-th variable. If this size
remains bounded as $n \to \infty$, then $G_t$ will typically be a tree
and, consequently, contradictions will arise with vanishingly
small probability during one step. If on the other hand the
size diverges for infinitely large samples, then $G_t$ will contain
loops, which opens the possibility for contradictions to appear.

In order to compute the typical size of $Z_t$, we notice that
$|Z_t| = |W_t| - |W_{t-1}|$, and consider a $t$ of order $n$, namely
$t = n\theta$. If we let $\phi(\theta) = \mathbb{E}|W_{n\theta}|/n$ denote the fraction of
frozen variables when a fraction $\theta$ of variables have been
fixed, then under mild regularity conditions we have

$$\lim_{n \to \infty} \mathbb{E}|Z_{n\theta}| = \frac{{\rm d}\phi(\theta)}{{\rm d}\theta} . \qquad (8)$$

Of course $\phi$ will be an increasing function of $\theta$. The argument
above implies that, as long as its derivative remains finite for
$\theta \in [0, 1]$, the algorithm finds a solution. When the
derivative diverges at some point $\theta_*$, then the number of
direct implications of a single variable diverges as well. The
spinodal point $\alpha_{\rm spin}(k)$ is defined as the smallest value
of $\alpha$ such that this happens.

The expectation in the definition of $\phi(\theta)$ is with respect
to the choices made by the real BP algorithm in the first
$n\theta$ steps, including the small mistakes it necessarily makes.
Our crucial hypothesis is that the location of $\alpha_{\rm spin}(k)$ does
not change (in the $n \to \infty$ limit) if $\phi(\theta)$ is computed along
the execution of the idealized decimation algorithm. In other
words we assume that the cumulative effect of BP errors
over $n$ decimation steps produces only a small bias in the
distribution of $\underline{x}^*$. For $\alpha > \alpha_{\rm BPd}(k)$ this hypothesis is no
longer consistent, as the real BP algorithm fails with high
probability.

Under this hypothesis, and recalling the description of the
state of the idealized algorithm given above, we can compute
$\phi(\theta)$ as follows. Draw a random formula on $n$ variables, a
uniformly random 'reference' solution $\underline{x}^*$, and a subset $U$ of $n\theta$
variable nodes.$^2$ Let $\phi_n(\theta)$ be the probability that a uniformly
random variable node $i$ is frozen, i.e. either in $U$ or directly
implied by $\underline{x}^*_U$. Then $\phi(\theta) = \lim_{n \to \infty} \phi_n(\theta)$. In the next
Section this computation will be performed in the random
tree model $T(\ell)$ of Sec. II.

$^2$ In the large $n$ limit one can equivalently draw $U$ by including in it each
variable of $V$ independently with probability $\theta$.


IV. The tree model and its analysis 

Let us consider a $k$-satisfiability formula whose factor
graph is a finite tree, and the uniform measure $\mu$ over
its solutions (which always exist) defined in Eq. (1). It
follows from general results [2] that the recursion
equations (2), (3) have a unique fixed point, that we shall denote
$\{h_{i \to a}, u_{a \to i}\}$. Further the BP marginals $\nu_i(\cdot)$, cf. Eq. (4),
are the actual marginals of $\mu$. Drawing a configuration $\underline{x}$ from
the law $\mu$ is most easily done in a recursive, broadcasting
fashion. Start from an arbitrary variable node $i$ and draw
$x_i$ with distribution $\nu_i$. Thanks to the Markov property of
$\mu$, conditional on the value of $x_i$, $\underline{x}_{V \setminus i}$ can be generated
independently for each of the branches of the tree rooted at
$i$. Namely, for each $a \in \partial i$, one draws $\underline{x}_{\partial a \setminus i}$ from

$$\mu(\underline{x}_{\partial a \setminus i} \,|\, x_i) = \frac{1}{z} \, w_a(x_i, \underline{x}_{\partial a \setminus i}) \prod_{j \in \partial a \setminus i} \nu_{j \to a}(x_j) . \qquad (9)$$

Here $z$ is a normalization factor and $\nu_{j \to a}(\cdot)$ denotes the
marginal of the variable $x_j$ in the amputated factor graph
where factor node $a$ has been removed (this is easily
expressed in terms of the message $h_{j \to a}$). Once all variables $j$
at distance 1 from $i$ have been generated, the process can be
iterated to fix variables at distance 2 from $i$, and so on. It
is easy to realize that this process indeed samples a solution
uniformly at random.

Following the program sketched in the previous Section, 
we shall study the effect of fixing a subset of the variables 
to the value they take in one of the solutions. We first state 
the following lemma. 

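Lemma 3. Suppose $U$ is a subset of the variables of a tree
formula, and let $\underline{x}^*$ be a uniformly random solution. The
probability that a variable $i \notin U$ is directly implied by $\underline{x}^*_U$
reads

$$\mu_i(0) \Big\{ 1 - \prod_{a \in \partial_+ i} (1 - \hat{u}_{a \to i}) \Big\} + \mu_i(1) \Big\{ 1 - \prod_{a \in \partial_- i} (1 - \hat{u}_{a \to i}) \Big\} , \qquad (10)$$

where the new messages $\{\hat{u}_{a \to i}, \hat{h}_{i \to a}\}$ are solutions of

$$\hat{h}_{i \to a} = \begin{cases} 1 & \text{if } i \in U, \\ 1 - \prod_{b \in \partial_- i(a)} (1 - \hat{u}_{b \to i}) & \text{otherwise,} \end{cases} \qquad (11)$$

$$\hat{u}_{a \to i} = \prod_{j \in \partial a \setminus i} \Big( \frac{1 - \tanh h_{j \to a}}{2} \, \hat{h}_{j \to a} \Big) . \qquad (12)$$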

We consider now a random tree factor graph and a random 
set of fixed variables U. 

Lemma 4. Consider a random tree formula $T(\ell)$ obtained
from the construction of Section II, and a random subset $U$
of its variable nodes defined by letting $j \in U$ independently
with probability $\theta$ for each $j$. Finally, let $\underline{x}^*$ be a uniformly
random solution of $T(\ell)$. Then the probability that the root
of $T(\ell)$ is frozen (either fixed or directly implied by $\underline{x}^*_U$) is

$$\phi^{\rm tree}_{\ell}(\theta) = \mathbb{E}_{\ell}\big[ (1 - \tanh h) \, \hat{h} \big] , \qquad (13)$$

where $\mathbb{E}_{\ell}[\cdot]$ denotes expectation with respect to the
distribution of $(h, \hat{h})_{\ell}$. This is a (vector) random variable defined by
recurrence on $\ell$ as

$$(h, \hat{h})_{\ell} = \Big( \sum_{i=1}^{l_+} u_i^+ - \sum_{i=1}^{l_-} u_i^- , \; 1 - \xi \prod_{i=1}^{l_-} (1 - \hat{u}_i^-) \Big) , \qquad (14)$$

$$(u, \hat{u})_{\ell+1} = \Big( f(h_1, \dots, h_{k-1}) , \; \prod_{i=1}^{k-1} \frac{1 - \tanh h_i}{2} \, \hat{h}_i \Big) , \qquad (15)$$

with initial condition $(u, \hat{u})_{\ell=0} = (0, 0)$ with probability 1.
In this recursion $f$ is the clause-to-variable BP update of Eq. (3),
$l_+$ and $l_-$ are two independent Poisson
random variables of parameter $\alpha k/2$, $\xi$ is a random variable
equal to 0 (resp. 1) with probability $\theta$ (resp. $1 - \theta$), and the
$\{(u_i^+, \hat{u}_i^+)\}$, $\{(u_i^-, \hat{u}_i^-)\}$ and $\{(h_i, \hat{h}_i)\}$ are independent copies
of, respectively, $(u, \hat{u})_{\ell}$ and $(h, \hat{h})_{\ell}$.

To obtain a numerical estimate of the function
$\phi^{\rm tree}(\theta) = \lim_{\ell \to \infty} \phi^{\rm tree}_{\ell}(\theta)$ we resorted to sampled density evolution
(also called 'population dynamics' in the statistical physics
context [6]), using samples of $10^5$ elements and $k = 4$ as a
working example, see Fig. 2. For small values of $\alpha$, $\phi^{\rm tree}(\theta)$
is smoothly increasing and slightly larger than $\theta$. Essentially
all frozen variables are fixed ones, and very few directly
implied variables appear. Moreover the maximal slope of
the curve is close to 1, implying that the number of new
frozen variables at each step, $|Z_t|$, remains close to 1. As $\alpha$
grows, $\phi^{\rm tree}(\theta)$ becomes significantly different from $\theta$, and
the maximal slope encountered in the interval $\theta \in [0, 1]$
gets larger. At a value $\alpha^{\rm tree}_{\rm spin}(k)$ the curve $\phi^{\rm tree}(\theta)$ acquires
a vertical tangent at $\theta_*(\alpha^{\rm tree}_{\rm spin})$, signaling the divergence of
the size of the graph of newly implied variables. Density
evolution gives us $\alpha^{\rm tree}_{\rm spin}(k = 4) \approx 8.05$, with an associated
value of $\theta_* \approx 0.35$. For $\alpha > \alpha^{\rm tree}_{\rm spin}(k)$ the curve $\phi^{\rm tree}(\theta)$
has more than one branch, corresponding to the presence of
multiple fixed points for $\theta \in [\theta_0(\alpha), \theta_*(\alpha)]$. In analogy with
[25], we expect the evolution of the algorithm to be described
by picking (for each $\theta$) the lowest branch of $\phi^{\rm tree}(\theta)$. The
resulting curve has a discontinuity at $\theta_*(\alpha)$, which is a slowly
decreasing function of $\alpha$.

We expect the tree computation to provide the correct
prediction for the actual curve $\phi(\theta)$ (i.e. $\phi^{\rm tree}(\theta) = \phi(\theta)$) for
a large range of the satisfiable regime, including $[0, \alpha^{\rm tree}_{\rm spin}(k)]$.
As a consequence, we expect $\alpha_{\rm spin}(k) = \alpha^{\rm tree}_{\rm spin}(k)$ and BP
decimation to be successful up to $\alpha_{\rm spin}(k)$. Similar tree
computations are at the basis of a number of statistical
mechanics computations in random $k$-SAT and have been
repeatedly confirmed by rigorous studies.

The relation between tree and graph can be formalized in terms of Aldous' local weak convergence method [30]. Fix a finite integer $\ell$ and consider the finite neighborhood $B(\ell)$ of radius $\ell$ around an arbitrarily chosen variable node of a uniformly drawn factor graph $G$ on $n$ variables. Denote by $\mu_{B(\ell),n}(\cdot)$ the law of $x_{B(\ell)}$ when $x$ is a uniformly random solution. We proceed similarly in the random tree ensemble. Draw a random tree $T(L)$ with $L>\ell$, let $T(\ell)$ be its first $\ell$ generations, and $\mu^{\rm tree}_{\ell,L}(\cdot)$ the distribution of $x_{T(\ell)}$.

Fig. 2. Fraction of frozen variables as a function of the fraction of fixed variables. Comparison between the tree model and the algorithmic numerical results, for 4-satisfiability formulas with $n=4000$, $\alpha=7$ and $\alpha=8.4$.



Considerations building on the field of statistical mechanics of disordered systems lead to the following hypothesis.



Conjecture 1. There exists a sequence $\alpha_c(k)$ such that, for any $\alpha<\alpha_c(k)$, $(B(\ell),\mu_{B(\ell),n}(\cdot))$ and $(T(\ell),\mu^{\rm tree}_{\ell,L}(\cdot))$ have the same weak limit. A precise determination of $\alpha_c(k)$ was presented in [8], yielding $\alpha_c(k)\approx 3.86, 9.55, 20.80$ for, respectively, $k=3,4,5$, and $\alpha_c(k)=2^k\log 2-\frac{3}{2}\log 2+O(2^{-k})$ at large $k$.



Local weak limits of combinatorial models on random graphs were recently considered in [31]. For a generalized conjecture in the regime $[\alpha_c(k),\alpha_s(k)]$ see [32].

A slightly stronger version of this conjecture would imply that $\phi(\theta)=\phi^{\rm tree}(\theta)$. As a consequence (following the discussion in the previous section) the tree model would correctly describe the algorithm evolution.

V. Numerical simulations 

In order to test the validity of our analysis we performed numerical simulations of the pseudo-code of Table I. Let us give a few further details on its implementation. The BP messages are stored as $\{\tanh h_{i\to a},\tanh u_{a\to i}\}$. Ambiguities in the update rule (3) arise when $\tanh u_{b\to i}=\tanh u_{c\to i}=1$ with $b\in\partial_+ i(a)$ and $c\in\partial_- i(a)$. Because of numerical imprecisions this situation can occur even before a contradiction has been detected by WP; such ambiguities are resolved by recomputing the incoming messages $\tanh u_{b\to i}$ using the regularized version of the update equation with a small positive value of $\epsilon$ (in practice we used $\epsilon=10^{-4}$).

As for the stopping criterion used in step 5, we leave the BP iteration loop if either of the two following criteria is fulfilled: (1) $\sup_{i\to a}|\tanh h^{(r)}_{i\to a}-\tanh h^{(r-1)}_{i\to a}|<\delta$, i.e. BP has converged to a fixed point within a given accuracy; (2) a maximal number of iterations $r_{\max}$, fixed beforehand, has been reached. In our implementation we took $\delta=10^{-10}$ and $r_{\max}=200$.
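The two stopping criteria can be sketched in Python as follows. This is a minimal illustration, not the code used for the simulations: `update` is a hypothetical callable performing one full BP sweep on the stored values $\tanh h_{i\to a}$, and the flat list representation of the messages is an assumption of the sketch.

```python
def iterate_until_converged(update, tanh_h, delta=1e-10, r_max=200):
    """Iterate `update` until the sup-norm change drops below delta
    (criterion 1) or r_max sweeps have been performed (criterion 2).

    `update` maps the current list of tanh(h_{i->a}) to the next one.
    Returns (messages, number_of_sweeps, converged_flag).
    """
    for r in range(1, r_max + 1):
        new = update(tanh_h)
        # sup over all directed edges of the change in tanh(h)
        diff = max((abs(x - y) for x, y in zip(new, tanh_h)), default=0.0)
        tanh_h = new
        if diff < delta:
            return tanh_h, r, True
    return tanh_h, r_max, False
```

Returning the convergence flag separately lets the caller distinguish a genuine fixed point from a run that merely exhausted its iteration budget.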

A first numerical check is presented in Fig. 2. The two dashed curves represent the fraction of frozen variables along the execution of the BP guided decimation algorithm, for two formulas of the 4-SAT ensemble, of moderate size ($n=4000$). The first formula had a ratio of constraints per variable $\alpha=7<\alpha_{\rm spin}$. In agreement with the picture obtained from the analytical computation, the algorithm managed to find a solution of the formula (no contradiction encountered) and the measured fraction of frozen variables follows quite accurately the tree model prediction. The second formula was taken in the regime $\alpha_{\rm spin}<\alpha=8.4<\alpha_c$. The algorithm halted because a contradiction was found, after roughly the fraction $\theta_*$ (computed from the tree model) of variables had been fixed. The portion of the curve before this event exhibits again a rather good agreement between the direct simulation and the model.

Fig. 3. Probability of success of the BP decimation algorithm as a function of the clause density $\alpha$ in random 4-SAT. The vertical line corresponds to the threshold $\alpha_{\rm spin}(4)$. Our analysis indicates that BP decimation finds a solution with probability bounded away from 0 for $\alpha<\alpha_{\rm spin}(4)$.

Fig. 4. Mean halting time for the BP decimation algorithm in random 4-SAT. The vertical line corresponds to the threshold $\alpha_{\rm spin}(4)$. The mean is taken over unsuccessful runs. For $\alpha<\alpha_{\rm spin}(4)$ a large fraction of the runs is successful and does not contribute to the mean.

Figure 3 shows the probability of success of BP decimation in a neighborhood of $\alpha_{\rm spin}(4)$ for random formulae of size $n=500, 1000, 2000$. Each data point is obtained by running the algorithm on 1000 to 3000 formulae. The data strongly suggest that the success probability is bounded away from 0 for $\alpha<\alpha_{\rm spin}(k)$, in agreement with our argument.

Finally, in Figure 4 we consider the number of variables $t_*$ fixed by BP decimation before a contradiction is encountered. According to the argument in Section III-D, $t_*/n$ should concentrate around the location $\theta_*$ of the discontinuity in $\phi(\theta)$. This is in fact the point at which the number of variables directly implied by a fixed one is no longer bounded. The comparison is again encouraging. Notice that for $\alpha<\alpha_{\rm spin}(k)$ we do not have any prediction, and the estimate of $t_*$ concerns only a small fraction of the runs.

To summarize, our simulations support the claim that, for $\alpha<\alpha_{\rm spin}(k)$, the success probability is strictly positive and the algorithm evolution follows the tree model. For $\alpha>\alpha_{\rm spin}(k)$ the main failure mechanism is indeed related to unbounded cascades of directly implied variables, after about $n\theta_*$ steps.

VI. Conclusions and future directions 

Let us conclude by highlighting some features of this work and proposing some directions for future research. It is worth mentioning that, as was also found in [8], random 3-SAT has a qualitatively different behavior compared to random $k$-SAT with $k\ge 4$. In particular we did not find any evidence for the existence of a vertical tangent point in the $k=3$ function $\phi(\theta)$ in the regime we expect to control through the tree computation, namely $\alpha<\alpha_c(3)\approx 3.86$.

Our analysis suggests that BP guided decimation is successful with positive probability for $\alpha<\alpha_{\rm spin}(k)$. Further, we argued that this threshold can be computed through a tree model and evaluated via density evolution. Although these conclusions are based on several assumptions, it is tempting to make a comparison with the best rigorous results on simple decimation algorithms. For $k=4$ the best result was obtained by Frieze and Suen [33], who proved SCB (shortest clause with limited amount of backtracking) to succeed for $\alpha<5.54$. This is far from the conjectured threshold of BP decimation, $\alpha_{\rm spin}(4)\approx 8.05$. For large $k$, an asymptotic expansion suggests that

$$\alpha_{\rm spin}(k)=\frac{e\,2^k}{k}\big(1+O(k^{-1})\big), \qquad (16)$$

whereas SCB is known from [33] to reach clause densities of $c_k 2^k/k$, with $c_k\to 1.817$ as $k\to\infty$. A rigorous version of our analysis would lead to a constant factor improvement. On the other hand, the quest for an algorithm that provably solves random $k$-SAT in polynomial time beyond $\alpha=O(2^k/k)$ is open.

From a practical point of view the decimation strategy studied in this paper is not the most efficient one. A seemingly slight modification of the pseudo-code of Table I consists in replacing the uniformly random choice of the variable to be fixed by a choice that favors the variables with the most strongly biased marginals. The intuition for this choice is that these marginals are the least subject to the 'small errors' of BP. The numerical results reported in [8] suggest that this modification significantly improves the performance of the decimation algorithm; unfortunately it also makes its analysis much more difficult.

This work was partially supported by EVERGROW, integrated project No. 1935 in the complex systems initiative of the Future and Emerging Technologies directorate of the IST Priority, EU Sixth Framework.

Appendix

Proof of Lemma 1. The statement is completely analogous to the equivalence between message passing and peeling versions of erasure decoding for LDPC codes [14]. Since the proof follows the same lines as well, we will limit ourselves to sketching its main points.

1) Let $\{\mathfrak{u}_{a\to i}\}$ and $\{\mathfrak{u}'_{a\to i}\}$ be two fixed points of WP. Then $\{\min(\mathfrak{u}_{a\to i},\mathfrak{u}'_{a\to i})\}$ is a fixed point as well. It follows that the 'minimal' fixed point is well defined and that it coincides with the limit of $\{\mathfrak{u}^{(r)}_{a\to i}\}$ irrespective of the order of WP updates.

2) Consider the ordering $\{i(1),i(2),\dots,i(q)\}$ according to which variables are declared as directly implied within UCP. For each $s\in\{1,\dots,q\}$ there is at least one unit clause involving only variable $i(s)$ before this was declared. Call this $a(s)$. Then use the same update order for WP, namely update, in sequence, message $\mathfrak{u}_{a(s)\to i(s)}$ and all the messages $\mathfrak{h}_{i(s)\to b}$ for $b\neq a(s)$. It is immediate to show that this leads to a fixed point, and the resulting WP-implied variables coincide with the directly implied variables. The proof is completed by using point 1.

3) Consider the same ordering of variables used in point 2 above. If there exist $i\in V$, $a\in\partial_+ i$, $b\in\partial_- i$ as in the statement, then UCP must have reduced both clauses $a$ and $b$ to a unit clause involving $x_i$ and requiring it to take different values. Vice versa, if UCP produces such a situation, in the WP updates $\mathfrak{u}^{(r)}_{a\to i}=\mathfrak{u}^{(r)}_{b\to i}=\mathrm{I}$ after some time $r$.

□
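For concreteness, the unit clause propagation (UCP) procedure referred to in this proof can be sketched in Python as below. The data layout is an assumption of the sketch: a clause is a list of `(variable, z)` pairs and is violated only when every listed variable equals its `z`, following the convention of the Introduction. The sketch processes clauses sequentially, so a contradiction shows up as a clause with no free variable left, which is equivalent to two unit clauses forcing opposite values.

```python
def unit_clause_propagation(clauses, fixed):
    """Return (implied, contradiction) for the partial assignment `fixed`.

    clauses: list of clauses, each a list of (variable, z) pairs.
    fixed:   dict mapping variable -> value in {0, 1}.
    implied: dict of the variables directly implied by `fixed`.
    """
    assign = dict(fixed)
    implied = {}
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            # a clause is satisfied as soon as one variable differs from z
            if any(v in assign and assign[v] != z for v, z in clause):
                continue
            free = [(v, z) for v, z in clause if v not in assign]
            if not free:
                return implied, True   # every variable equals z: contradiction
            if len(free) == 1:         # unit clause: last variable is forced
                v, z = free[0]
                assign[v] = implied[v] = 1 - z
                changed = True
    return implied, False
```

On the two clauses `[[(0, 0), (1, 0)], [(1, 1), (2, 0)]]` with `fixed = {0: 0}`, the first clause forces $x_1=1$, which in turn reduces the second clause to a unit clause forcing $x_2=1$.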



Proof of Lemma 2. The same statement has been proved for the Maxwell decoder [25]. We therefore briefly recall the basic ideas used in that case.

First of all, the only WP messages changing from step $t-1$ to step $t$ (call these the 'new' messages) are the ones on the edges of the tree $G_t$, and directed outwards. As a consequence, no contradiction can arise because of two contradicting new messages, because no variable node has two incoming new messages.

There could be, in principle, a contradiction between a new and an old message. The crucial observation is that any factor node in $F_t$ has at most two adjacent variable nodes in $Z_t$ (because otherwise it could not 'transmit' an implication). If a variable node $i$ already receives some I message at time $t-1$ from clause $a$, then it cannot receive any new message at time $t$ from a different clause $b$. This is because the message $i\to b$ must already be I, and therefore clause $b$ is already effectively 'reduced'.

An alternative argument consists in considering the equivalent UCP representation. If $G_t$ is a tree, then no variable appears twice in a unit clause, and therefore no contradiction arises. □

Proof of Lemma 3. Since we are dealing with a tree graph, Eqs. (11), (12) admit a unique solution, determined from the boundary condition $\hat h_{i\to a}=1$ (resp. $\hat h_{i\to a}=0$) if $i$ is a leaf in $U$ (resp. a leaf outside of $U$). The newly introduced messages have the following interpretation. Imagine running WP, cf. Eqs. (6), (7), to find which variables are directly implied by $x^*_U$. Then $\hat u_{a\to j}$ is the probability that $\mathfrak{u}_{a\to j}=\mathrm{I}$ when $x^*_U$ is drawn conditional on $x_j$ satisfying $a$. Further, $\hat h_{j\to a}$ is the probability that $\mathfrak{h}_{j\to a}=\mathrm{I}$ when $x^*_U$ is drawn conditional on $x_j$ not satisfying clause $a$.

Now, suppose $x_i$ has been fixed to $x^*_i$ drawn according to its marginal (hence the two terms in Eq. (10)) and a configuration $x^*$ has been generated conditional on $x^*_i$, through the broadcast construction. Then the configuration of the variables in $U$ is retained, $x_U=x^*_U$, and the rest of $x^*$ is discarded. The status (directly implied or not) of $x_i$ is read off from the values of the messages $\mathfrak{u}_{a\to i}$ it receives. It is easy to convince oneself that $x_i$ cannot be implied to take the value opposite to the one it took at the beginning of the broadcasting: by definition $x^*_U$ is compatible with it. Equation (10) follows by computing the probability that at least one of the messages $\mathfrak{u}_{a\to i}$ is equal to I among the ones from clauses $a$ that are satisfied by $x^*_i$.

Equation (11) is derived by applying the same argument to the branch of the tree rooted at $j$ and not including factor node $a$. Finally, to derive Eq. (12) notice that, in order for variable $x_i$ to be directly implied to satisfy clause $a$, each of the variables $j\in\partial a\setminus i$ must be implied by the corresponding subtree not to satisfy $a$. From the above remark, this can happen only if none of the $\{x^*_j\}$ satisfies $a$. The probability of this event is easily found to be

$$\prod_{j\in\partial a\setminus i}\frac{1-\tanh h_{j\to a}}{2}\,.$$

□



Proof of Lemma 4. Denote by $\rho$ the root of $T(\ell)$. Conditional on the realization of the tree and of the set $U$, the probability of a direct implication of the root is obtained by solving (2), (3), (11), (12) for the edges directed towards the root, which leads to couples of messages $\{(h_{i\to a},\hat h_{i\to a})\}$ and $\{(u_{a\to i},\hat u_{a\to i})\}$ along the edges of $T(\ell)$. Since $T(\ell)$ and $U$ are random, these couples of messages are random variables as well.

We claim that for $\ell\ge 1$, the messages $(u_{a\to\rho},\hat u_{a\to\rho})$ sent to the root of $T(\ell)$ by the adjacent constraint nodes are distributed as $(u,\hat u)_\ell$. Similarly, for $\ell\ge 0$, $(h,\hat h)_\ell$ has the distribution of the messages sent from the first generation variables to their ancestor constraint node in a random $T(\ell+1)$. This claim is a direct consequence of Eqs. (2), (3), (11), (12) and of the definition of $T(\ell)$ and $U$. The random variables $l_\pm$ have, for instance, the distribution of the cardinalities of $\partial_\pm i(a)$ for an arbitrary edge of the random tree, as $|\partial i\setminus a|$ is Poisson($\alpha k$) distributed and the unsatisfying values $z(i,a)$ of the variables are chosen independently with equal probability.

Finally, the expression of $\phi^{\rm tree}_\ell(\theta)$ is obtained from (10) by noting that the cardinalities of $\partial_\pm i$ for the root of $T(\ell)$ are distributed as the ones of $\partial_\pm i(a)$ and using the global symmetry between 0 and 1, which implies that on average the two terms of (10) yield the same contribution. Note that the dependence on $\theta$ of $\phi^{\rm tree}_\ell$ arises through the distribution of $(h,\hat h)_\ell$, the bias of the coin $\xi$ used in (14) being $\theta$. □

Details on the population dynamics algorithm. The numerical procedure we followed in order to determine $\phi^{\rm tree}_\ell(\theta)$ amounts to approximating the distribution of the random variable $(u,\hat u)_\ell$ by the empirical distribution of a large sample of couples $\{(u_j,\hat u_j)\}_{j=1}^N$. A sample $\{(h_j,\hat h_j)\}_{j=1}^N$ is then generated according to Eq. (14): for each $j\in[N]$ one draws two Poisson random variables $l_+$ and $l_-$, $l_++l_-$ indexes $j_i^\pm$ uniformly in $[N]$, and a biased coin $\xi$. The $j$-th element of the sample is thus computed as

$$(h_j,\hat h_j)=\Big(\sum_{i=1}^{l_+}u_{j_i^+}-\sum_{i=1}^{l_-}u_{j_i^-},\; 1-\xi\prod_{i=1}^{l_-}\big(1-\hat u_{j_i^-}\big)\Big). \qquad (17)$$

Subsequently the sample $\{(u_j,\hat u_j)\}$ is updated from $\{(h_j,\hat h_j)\}$ by a similar interpretation of Eq. (15). After $\ell$ iterations of these two steps, starting from the initial configuration $(u_j,\hat u_j)=(0,0)$ for all $j\in[1,N]$, the estimate of $\phi^{\rm tree}_\ell(\theta)$ is given by

$$\frac{1}{N}\sum_{j=1}^{N}(1-\tanh h_j)\,\hat h_j\,. \qquad (18)$$

When $\ell$ gets large this quantity is numerically found to converge to a limit we denoted $\phi^{\rm tree}(\theta)$. □
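The procedure just described can be sketched in a few lines of Python. This is a hedged illustration rather than the code used for Fig. 2. The function names, the default population size and the Poisson sampler are choices of this sketch, and the clause update `f` is assumed to be the standard BP rule $f(h_1,\dots,h_{k-1})=-\frac12\log\{1-\prod_i(1-\tanh h_i)/2\}$, which is not restated in this part of the paper.

```python
import math
import random

def poisson(lam, rng):
    """Knuth's method for a Poisson(lam) sample (adequate for moderate lam)."""
    threshold, n, p = math.exp(-lam), 0, rng.random()
    while p > threshold:
        n += 1
        p *= rng.random()
    return n

def f(hs):
    """Assumed BP clause update: -1/2 log(1 - prod_i (1 - tanh h_i)/2)."""
    prod = math.prod(0.5 * (1.0 - math.tanh(h)) for h in hs)
    return -0.5 * math.log(1.0 - min(prod, 1.0 - 1e-15))  # guard against log(0)

def phi_tree(alpha, k, theta, n_pop=2000, n_iter=30, seed=0):
    """Population-dynamics estimate of phi_ell^tree(theta), Eqs. (14)-(18)."""
    rng = random.Random(seed)
    lam = alpha * k / 2.0
    pop_u = [(0.0, 0.0)] * n_pop                  # initial (u, u_hat) = (0, 0)
    estimate = 0.0
    for _ in range(n_iter):
        pop_h = []
        for _ in range(n_pop):                    # variable update, Eq. (17)
            plus = [rng.choice(pop_u) for _ in range(poisson(lam, rng))]
            minus = [rng.choice(pop_u) for _ in range(poisson(lam, rng))]
            h = sum(u for u, _ in plus) - sum(u for u, _ in minus)
            xi = 0 if rng.random() < theta else 1  # xi = 0 with probability theta
            h_hat = 1.0 - xi * math.prod(1.0 - uh for _, uh in minus)
            pop_h.append((h, h_hat))
        # Eq. (18), evaluated on the current (h, h_hat) sample
        estimate = sum((1.0 - math.tanh(h)) * hh for h, hh in pop_h) / n_pop
        pop_u = []
        for _ in range(n_pop):                    # clause update, Eq. (15)
            sample = [rng.choice(pop_h) for _ in range(k - 1)]
            u_hat = math.prod(0.5 * (1.0 - math.tanh(h)) * hh for h, hh in sample)
            pop_u.append((f([h for h, _ in sample]), u_hat))
    return estimate
```

Scanning `theta` over $[0,1]$ at fixed `alpha` and `k = 4` traces a curve of the kind shown in Fig. 2. The pure-Python loops are slow for populations of $10^5$ elements, so a vectorized implementation would be preferable for production runs.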



Large k argument. Consider the function $\hat\phi(\theta)$ defined, for $\theta\in[0,1]$, as the smallest solution in $[0,1]$ of the equation

$$\hat\phi=\theta+(1-\theta)\Big(1-\exp\Big\{-\frac{\alpha k}{2^k}\,\hat\phi^{\,k-1}\Big\}\Big). \qquad (19)$$

It can be shown that $\hat\phi(\theta)$ is a smoothly increasing function of $\theta$ as long as $\alpha<\hat\alpha_{\rm spin}(k)$, while for larger values of $\alpha$ a discontinuous jump develops in its curve. This threshold can be explicitly computed and reads

$$\hat\alpha_{\rm spin}(k)=\frac{2^k}{k}\Big(1+\frac{1}{k-2}\Big)^{k-2}. \qquad (20)$$

We believe this simple-to-determine function $\hat\phi(\theta)$ to be equivalent to the true $\phi(\theta)$ in the large $k$ limit, up to corrections exponentially small in $k$. In fact (13), (14), (15) imply the following exact equation,

$$\mathbb{E}[\hat h]=\theta+(1-\theta)\Big(1-\exp\Big\{-\frac{\alpha k}{2^k}\,\phi(\theta)^{k-1}\Big\}\Big), \qquad (21)$$

where the expectation is taken in the $\ell\to\infty$ limit. For large values of $k$ one can show the random variable $h$ to be exponentially close to 0, hence $\phi(\theta)$ and $\mathbb{E}[\hat h]$ coincide at the leading order, and by comparing (19) and (21) they also coincide with $\hat\phi(\theta)$. The conjecture stated in Eq. (16) was obtained by expanding (20) at the leading order. □
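The simplified fixed-point equation (19) is easy to explore numerically. The Python sketch below reads Eq. (19) as $\hat\phi=\theta+(1-\theta)(1-\exp\{-(\alpha k/2^k)\hat\phi^{k-1}\})$ and Eq. (20) as $(2^k/k)(1+1/(k-2))^{k-2}$; both readings, as well as the function names, are assumptions of this sketch. Since the right-hand side of (19) is increasing in $\hat\phi$, iterating it from $\hat\phi=0$ converges monotonically to the smallest solution.

```python
import math

def alpha_spin_hat(k):
    """Threshold of Eq. (20): (2^k / k) * (1 + 1/(k-2))^(k-2), for k >= 3."""
    return 2.0 ** k / k * (1.0 + 1.0 / (k - 2)) ** (k - 2)

def phi_hat(theta, alpha, k, tol=1e-12, max_iter=100000):
    """Smallest solution in [0, 1] of Eq. (19), by iteration from phi = 0."""
    c = alpha * k / 2.0 ** k
    phi = 0.0
    for _ in range(max_iter):
        new = theta + (1.0 - theta) * (1.0 - math.exp(-c * phi ** (k - 1)))
        if abs(new - phi) < tol:
            return new
        phi = new
    return phi  # not converged within max_iter (can happen near the jump)
```

With this reading, `alpha_spin_hat(4)` evaluates to 9, to be compared with the density evolution value $\alpha_{\rm spin}(4)\approx 8.05$ quoted in the text; the two are only expected to agree at large $k$. Above the threshold, scanning `theta` with `phi_hat` exhibits the discontinuous jump of the lowest branch.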



